Abstract
Supervised cross-modal hashing has attracted considerable research attention. Existing methods either seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although these methods have achieved notable results, they neglect two issues: 1) some methods designed for classification tasks are not suitable for retrieval tasks, since they do not learn the personalized features of samples; 2) the outcome of hash retrieval depends on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than label semantics alone, in this paper we propose a novel supervised cross-modal hashing collaborative learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces. One is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG collaboratively learns feature semantics and label semantics, with labels playing the dominant role and features the auxiliary role. Third, we use labels to strengthen the similarity relationship between inter-modal samples by preserving pairwise closeness, so that label semantics are fully exploited to avoid classification errors. Fourth, we introduce class weights to further increase the intra-modal discrimination between samples that belong to different classes while keeping the similarity of samples unchanged. Therefore, the CHRLSG model not only preserves the relationships between samples but also maintains the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing advanced methods.
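To make the retrieval setting concrete, the following is a minimal sketch of the generic building blocks the abstract refers to: binarizing a real-valued latent representation into a hash code, deriving a pairwise similarity from zero-one labels, and ranking database items by Hamming distance. It is not the CHRLSG optimization itself. All function names, the sign-based binarization, and the "share at least one label" similarity rule are assumptions chosen for illustration.

```python
from typing import List, Sequence


def binarize(latent: Sequence[float]) -> List[int]:
    """Map a real-valued latent vector to a {-1, +1} code with the sign function.

    Assumption: zero is mapped to +1 so that every bit is defined.
    """
    return [1 if v >= 0 else -1 for v in latent]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length hash codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_similarity(labels: Sequence[Sequence[int]]) -> List[List[int]]:
    """Build S with S[i][j] = 1 if samples i and j share at least one label.

    Assumption: this is a common convention for multi-label data, used here
    only as an illustrative stand-in for label-based pairwise closeness.
    """
    return [
        [1 if any(p and q for p, q in zip(li, lj)) else 0 for lj in labels]
        for li in labels
    ]


def rank_by_hamming(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming_distance(query, database[i]))


if __name__ == "__main__":
    # Hypothetical latent vector of an image query and text-side database codes.
    query_code = binarize([0.3, -1.2, 0.0, 2.0])
    text_codes = [[-1, 1, -1, -1], [1, -1, 1, -1], [1, -1, 1, 1]]
    print(query_code)
    print(rank_by_hamming(query_code, text_codes))
    print(pairwise_similarity([[1, 0], [1, 1], [0, 1]]))
```

In a cross-modal setting the query code would come from one modality (for example an image) and the database codes from the other (for example text), which is why the quality of the learned codes, rather than the ranking step, determines retrieval accuracy.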
Availability of Data and Materials
The data and materials used during the current study are available from the corresponding author on reasonable request.
Acknowledgements
I am very grateful to all those who helped me develop these ideas and put them into practice.
Funding
This study is supported in part by the Key-Area Research and Development Program of Guangdong Province under grant 2020B010166006, the National Natural Science Foundation of China under grants 61972102, 62202107, and 62176066, and the Guangzhou Science and Technology Plan Project under grant 2023A04J1729.
Ethics declarations
Competing interest
The authors declare that they have no competing interests.
Ethical Approval
Not applicable.
About this article
Cite this article
Teng, S., Huang, W., Wu, N. et al. Discrete cross-modal hashing with relaxation and label semantic guidance. World Wide Web 27, 4 (2024). https://doi.org/10.1007/s11280-024-01239-6
DOI: https://doi.org/10.1007/s11280-024-01239-6